Performance Improvement Of Bengali Text Compression Using Transliteration And Huffman Principle

Authors

  • Md. Mamun Hossain
  • Ahsan Habib
  • Mohammad Shahidur Rahman
Abstract

In this paper, we propose a new compression technique based on transliteration of Bengali text to English. English uses a smaller symbol set than Bengali, so transliterating Bengali text to English reduces the number of distinct characters to be coded. Huffman coding is well known for producing optimal prefix codes for a given symbol distribution. When the Huffman principle is applied to the transliterated text, a significant performance improvement is achieved in terms of decoding speed and space requirement compared to Unicode compression.
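The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mapping `TOY_MAP` and the helper names `transliterate`, `build_huffman_codes`, `encode`, and `decode` are hypothetical, and the mapping covers only a handful of characters, whereas a real transliteration scheme must handle the full script (conjuncts, vowel signs, and so on) and remain reversible. The sketch transliterates a short Bengali string, builds a Huffman code over the resulting Latin symbols, and compares the encoded length with a fixed 16 bits per character.

```python
import heapq
from collections import Counter

# Hypothetical toy mapping for illustration only; not the paper's scheme.
# Each Bengali code point maps to one distinct Latin symbol, so it is reversible.
TOY_MAP = {"আ": "a", "ম": "m", "ি": "i", "া": "A", "র": "r", " ": " "}


def transliterate(text):
    """Map each Bengali code point to its Latin stand-in (raises KeyError if unmapped)."""
    return "".join(TOY_MAP[ch] for ch in text)


def build_huffman_codes(text):
    """Return a dict symbol -> bit string built with the standard Huffman merge."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code words of every symbol in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a bit string by matching prefix-free code words left to right."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    bengali = "আমি আমার"
    latin = transliterate(bengali)
    codes = build_huffman_codes(latin)
    bits = encode(latin, codes)
    print("transliterated:", latin)
    print("Huffman bits:", len(bits), "vs fixed 16-bit:", 16 * len(bengali))
    print("round trip ok:", decode(bits, codes) == latin)
```

The bit count here excludes the cost of storing the code table, which matters for short inputs; the sketch only shows how the two steps fit together.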

Similar resources

Leveraging Statistical Transliteration for Dictionary-Based English-Bengali CLIR of OCR'd Text

This paper describes experiments with transliteration of out-of-vocabulary English terms into Bengali to improve the effectiveness of English-Bengali Cross-Language Information Retrieval. We use a statistical translation model as a basis for transliteration, and present evaluation results on the FIRE 2011 RISOT Bengali test collection. Incorporating transliteration is shown to substantially and...

An Enhanced Static Data Compression Scheme Of Bengali Short Message

This paper concerns a modified approach to compressing short Bengali text messages for small devices. The prime objective of this research technique is to establish a low-complexity compression scheme suitable for small devices having small memory and relatively lower processing speed. The basic aim is not to compress text of any size up to its maximum level without having any constraint on space...

Revisiting Automatic Transliteration Problem for Code-Mixed Romanized Indian Social Media Text

Although automatic transliteration for Indian languages is a well-studied paradigm, available transliteration techniques fail in the Indian social media context due to phenomena such as wordplay, creative spelling, code-mixing, and phonetic romanized typing, all implying that transliteration for Indian social media text has to be revisited. The paper reports an initial study on automatic ...

An Effective Approach for Compression of Bengali Text

In this paper, we propose an effective and efficient approach for compressing Bengali text. This paper focuses on a methodical study of Bengali text compression techniques. The main target of this research is to provide a framework for Bengali text compression that ensures a simple, computationally inexpensive, and effective scheme. The proposed Bengali text compres...

Design and Analysis of an Effective Corpus for Evaluation of Bengali Text Compression Schemes

In this paper, we propose an effective platform for evaluation of Bengali text compression schemes. A novel scheme for construction of Bengali text compression corpus has also been incorporated in this paper. A methodical study on the formulation-approaches of text corpus for data compression and present an effective corpus named Ekushe-Khul for evaluating the Bengali text compression schemes h...

Publication date: 2016